Evaluation of Tree-Trellis Based Decoding in Over-Million LVCSR
Authors
Abstract
Very large vocabulary continuous speech recognition (CSR) that can recognize every sentence is one of the important goals of speech recognition. Several attempts have been made to achieve very large vocabulary CSR. However, very large vocabulary CSR using a tree-trellis based decoder has not been reported. We report the performance evaluation and improvement of the “Julius” tree-trellis based decoder in large vocabulary CSR (LVCSR) with a vocabulary of more than one million words, referred to here as over-million LVCSR. Experiments indicated that Julius achieved a word accuracy of about 91% and a real-time factor of about 2 in over-million LVCSR for Japanese newspaper speech transcription.
Similar resources
Sequential Decoding of Lattice Codes
We consider lattice tree-codes based on a lattice A having a finite trellis diagram T. Such codes are easy to encode and benefit from the structure of A. Sequential decoding of lattice tree-codes is studied, and the corresponding Fano metric is derived. An upper bound on the running time of the sequential decoding algorithm is established, and found to resemble the Pareto distribution. O u ...
Decoding-time prediction of non-verbalized punctuation
This paper presents novel methods that integrate lexical prediction of non-verbalized punctuation with Viterbi decoding for Large Vocabulary Conversational Speech Recognition (LVCSR) in a single pass. We describe two different approaches: one based on a modified finite state machine representation of language models and one based on an extension of an LVCSR decoder. We discuss advantages over t...
Voice Assimilation Phenomenon and Its Implementation in LVCSR System with Lexical Tree and Bigram Language Model
In this paper, an LVCSR system that implements the Czech voice assimilation phenomenon is proposed. The recognition system uses lexical trees and a bigram language model. The first part of this article focuses on the description of the voice assimilation phenomenon, triphone lexical tree construction, and the impact of voice assimilation on LVCSR system performance. The second part outlines lexical tree ...
A Broadcast News Corpus for Evaluation and Tuning of German LVCSR Systems
Transcription of broadcast news is an interesting and challenging application for large-vocabulary continuous speech recognition (LVCSR). We present in detail the structure of a manually segmented and annotated corpus including over 160 hours of German broadcast news, and propose it as an evaluation framework of LVCSR systems. We show our own experimental results on the corpus, achieved with a ...
Symbol-by-Symbol MAP Decoding of Variable Length Codes
In this paper we introduce a new approach to the decoding of variable length codes. Based on the tree structure of these codes, a trellis representation is derived which allows the application of the BCJR algorithm. This algorithm provides us with the a posteriori probabilities of the transmitted source symbols. Therefore we do not only use soft information from an outer decoding stage but also ...